perm filename BALZR3.IJC[ESS,JMC] blob
sn#026602 filedate 1973-02-15 generic text, type T, neo UTF8
pdq
A GLOBAL VIEW OF AUTOMATIC PROGRAMMING
ROBERT M. BALZER
USC INFORMATION SCIENCES INSTITUTE
ISI/RR-73-9
4676 ADMIRALTY WAY
MARINA DEL REY, CALIFORNIA 90291
(213) 822-1511
FEBRUARY 1973
This research is supported by the Advanced Research Projects
Agency under Contract No. DAHC15 72 C 0308, ARPA Order No.
2223/1, Program Code No. 3D30 and 3P10.
Views and conclusions contained in this study are the author's
and should not be interpreted as representing the official
opinion or policy of the University of Southern California or any
other person or agency connected with it.
A Global View of Automatic Programming Page 2
Robert M. Balzer
PREFACE AND ACKNOWLEDGEMENT
This paper represents the author's personal view of a global
description of Automatic Programming. This view resulted from
the author's discussions with and suggestions from numerous
colleagues on this area for which he is deeply indebted.
This paper is a condensation of this view, taken from a
larger work [1] which attempts to structure the field on the
conceptualization expressed here.
A Global View of Automatic Programming Page 3
Robert M. Balzer
INTRODUCTION
The goals of automatic programming are deceptively simple;
namely, the effective utilization of computers. This implies
both simplicity of use and efficient application of the computing
resources.
Thus, automatic programming is the application of a
computing system to the problem of effectively utilizing that or
another computing system in the performance of a task specified
by the user. Although this is certainly what we mean by
automatic programming, this definition does little to restrict
the set of applicable computer systems included in the automatic
programming domain. All compilers, operating systems, debugging
systems, text editors, etc., fall into this domain and, in fact,
the term 'automatic programming' itself was applied to early
compiler systems during the 1950s.
What we need therefore is to either appropriately restrict
the definition of automatic programming, or to provide a set of
criteria used for measuring the acceptability or performance of
an automatic programming system. Such a measure of system merit
is extremely hard to arrive at but would contain the following
man, machine, and system applicability components:
The efficiency of the resulting program;
The amount of computer resources expended in
arriving at that program;
A Global View of Automatic Programming Page 4
Robert M. Balzer
The elapsed time used in arriving at the resulting
program;
The amount of effort expended by the user in
specifying the task;
The reliability and ruggedness of the resulting
program;
The ease with wich future modifications can be
incorporated;
and finally,
The range and complexity of tasks which can be
handled by the system.
THE BASIC MODEL
We conjecture that the solution of every
computable problem can be represented entirely in
problem domain terms as a sequence, which may involve
loops and conditionals, of actions in that domain which
affect a data base of relationships between the
entities of the domain. Included either as part of the
data base or as a separate part of the model, is the
history of the model (i.e., the sequence of actions
applied to the model). This logically completes the
model and enables questions or actions involving
historical information to be handled. In a strong
sense, such a solution is a direct simulation of the
A Global View of Automatic Programming Page 5
Robert M. Balzer
domain. The system models at each step what would
occur in the domain.
The important part of the above conjecture is that
any computable problem can be solved, and hence
described, in problem domain terms. This enables us to
divide the solution into two parts, an external and an
internal part. The external part is the problem
specification given by the user in completely domain
specific terms. The requirements for such users is no
longer a comprehensive knowledge of computers, but
rather the ability to completely characterize the
relevant relationships between entities of the problem
domain and the actions in that domain. In addition,
such users should have a rough awareness of the problem
solving capability of the system so that they can
provide additional help where needed in the form of
more appropriate macro-actions, recommendations about
the use of certain actions, and/or imperative sequences
which will solve part or all of the problem in problem
related terms.
The internal part is first concerned with finding
a solution in problem related terms, if this has not
already been provided by the user. Second, this part
is concerned with finding efficient solutions given the
available computing resources. Such optimizations
A Global View of Automatic Programming Page 6
Robert M. Balzer
occur at two levels beyond what we normally consider
optimization. First, at the problem level, recognition
that certain entities and/or relationships are
irrelevant enables their removal from the model.
Second, since only part of the state of the modelled
domain is required, and only at certain points in the
solution process, rather than simulating the model
completely at each step, the system can employ
alternative representations which require less
maintenance and which either directly mirror the
required part of the domain state or allow such parts
to be computationally inferred. Such representations
may also enable more direct solution of the problem.
It is these optimizations which form the main
distinction between the code generation part of an
Automatic Programming system and current
state-of-the-art compilers.
Thus, our definition of an Automatic Programming
system is one which accepts a problem in terms of a
complete model of the domain, which obtains a solution
for the problem in terms of this model, and produces an
efficient computer implementation of this solution in
the form of a program.
.SYSTEM REQUIREMENTS
There are eight facilities that we desire in
A Global View of Automatic Programming Page 7
Robert M. Balzer
Automatic Programming Systems. The first is an
interactive system. We want interaction between the
system and the user so that the specification can be
given incrementally and any errors or discrepancies
that arise in or from such specification can be handled
as they occur.
Along with this interactiveness we also expect the
system to be very forgiving. It should allow great
flexibility in the way and time at which information is
specified. It should also be forgiving by allowing the
user to change or retract things that he has specified.
The second criteria is the amount of
non-proceduralness used in the specification of the
task to be performed. As far as possible we should
tell the system what it is we want to do rather than
how to do that process. There is a continuum between
the statement of a problem in terms of its initial
state and its goal state and a specification of how to
do it in a machine language. Most of computer languge
development can be viewed as a movement from specifying
HOW to do something towards a statement of WHAT is
desired. The level of non-proceduralness achievable
within an Automatic Programming Sytem is directly
related to the system's capability of turning goals
into actions and this is dependent upon its knowledge
A Global View of Automatic Programming Page 8
Robert M. Balzer
of how to acieve certain results in the problem domain.
We use the ability of the system to achieve results in
the problem domain as the main distinction between
non-procedural and procedural languages. Thus we
require that the problems be stated in a language
appropriate for that domain, i.e., one that can express
the structure and interrelationships of the entities
within that domain, and one that users are familiar
with for discussing and describing tasks and problems
within that domain. Only with such a language can the
system know how to achieve the desired results rather
than being directly told how to produce the desired
result. On the other hand, some actions can best be
described in terms of how to accomplish them rather
than by the resulting state. We have no bias against
or for such descriptions as long as they are specified
entirely in problem domain terms rather than
implementation computer terms.
Along with a non-procedural language for
specifying problems we also want to reduce the amount
that must be specified for the system to correctly
process the problem. This can best be done by removing
details from the specification and allowing the system
to fill them in. As with non-proceduralness there is a
continuum here but the level that we are aiming for is
one which omits from the specification all references
A Global View of Automatic Programming Page 9
Robert M. Balzer
to any entities not contained within the problem
domain. Specifically, we wish to avoid references to
the data structures of computer science; lists, rings,
arrays, symbol tables, and the like, that have been
invented as a means of specifying 'how' rather than
'what' to do.
Next we need a mechanism to enable us to change
specifications that have been previously entered either
because they don't work or because we want them to do
different things, or because the environment in which
they are operating has changed. Such specifications
should be given in terms of changes in desired results.
Again, entirely within the language of the problem
domain itself.
Once a problem has been specified we need a
mechanism for insuring that the system produced is the
one that we desired. This is especially critical since
the system will be performing much augmentation for us.
Discrepancies between the desired system and the one
produced can arise from an inadequate knowledge of the
system about the domain, from a misstatement by the
user, or from a misinterpretation of something that was
specified. Whatever the cause, it is important that
the user can see the produced system in operation in
his own terms, i.e., in the language of the problem
A Global View of Automatic Programming Page 10
Robert M. Balzer
domain itself, so that he can check the expected
behavior against the behavior produced. In addition,
we expect that the system should be able to generate
test input data so that a wide range of behavior of the
task specified can be observed by the user. If a
discrepancy is found the user additionally requires
capabilities to locate and isolate the source of the
discrepancy and then to modify it to obtain the desired
result.
After a correct program has been produced we need
a mechanism for the system to make this program
efficient. Such efficiency will rest on two kinds of
information. First, knowledge about the problem domain
which enables alternative ways of performing the same
task to be evaluated. Secondly, information about
efficient ways to utilize computers so that a total
cost can be assigned to each of the different ways of
solving the problem in domain terms. What we have
implied but not stated in the above is that the goal
Automatic Programming systems are not just providing
facilities to automate programming, but facilities
which help the user in moving from an understanding of
a task to be accomplished to a finished running system
which performs that task, i.e., an Automatic
Programming system is a system which aids the user in
all the steps from problem definition and design to
A Global View of Automatic Programming Page 11
Robert M. Balzer
final completed running programs.
The final criteria for the Goal Automatic
Programming system is the range of their applicability.
To meet all the above criteria the system requires
detailed knowledge about the problem domain. The
requirement for this knowledge limits the system
applicability to other areas, and hence one measure of
such systems is the range of problem domains which they
can adequately handle and their method of obtaining
this knowledge.
A Global View of Automatic Programming Page 12
Robert M. Balzer
ALTERNATIVE DEFINITIONS AND APPROACHES
In defining Automatic Programming we started with
a goal we wished to accomplish, namely, reducing the
effort required to get a task running on a computer.
From this we adopted a framework in which the external
characteristics of Automatic Programming Systems could
be described in terms of:
1. The terms in which the problem is stated;
2. The method and time at which the system acquires
the
knowledge of the problem domain;
3. The characteristic of the resulting program.
The choices we made are:
1. Problem statement in natural language in terms of
the
problem domain.
2. Knowledge about the domain acquired interactively
in
natural language in terms of the complete model of
the
problem domain.
3. Resulting programs which are optimized with
respect
to data representations, control structure, and
code.
A Global View of Automatic Programming Page 13
Robert M. Balzer
Other choices could certainly be made, and the
resulting systems might look very different. One
particular set of choices, suggested by Al Perlis,
represents a system based on a completely different
paradigm. His system is predicated on incremental
growth from an accepted base, namely, FORTRAN. This
idea is to increase the declarative parts of the
language at the expense of the procedural. The
declarative parts are in turn replaced by a series of
questions from the system which specify how a defined
concept is to be used. For instance, the concept
'array' might generate queries to see whether the size
was dynamic at run time, whether insertions or
deletions are being done, whether the elements are
homogeneous, and whether they are accessed
sequentially. From such questions and the programs
which use these concepts, an optimal representation can
be chosen.
In this system, higher levels occur when enough
example systems have been generated and understood by
humans that someone can codify this knowledge and
introduce a new level of semantics and questions.
There is no doubt that such a system would be
useful, and it has the appealing attribute that the
facilities of the system can be incrementally expanded
A Global View of Automatic Programming Page 14
Robert M. Balzer
from a widely distributed and available base. Also,
such a system could instantly be used by existing
programmers who could gradually learn to use the new
facilities on top of their ability to use FORTRAN.
On the other hand, it is not clear how far such a
technique can be pushed, and whether this is the best
way to achieve our goal except in the short-term. This
and other choices and approaches should, however, be
investigated.
A Global View of Automatic Programming Page 15
Robert M. Balzer
THE FOUR PHASES: AN OVERVIEW
Automatic Programming begins with the application
of problem solving to problem statements rather than
problem solutions; i.e., with the attempt by a computer
system to obtain an understanding of the task being
specified. Once the task has been understood, if it is
not in process form, it must be transformed into one.
This is the traditional area of Artificial Intelligence
and human program design. It must be verified that the
resulting process model is the one desired by the user
and that it is adequate for the user's problem. If
not, it must be modified and transformed by the above
steps and reverified. It must then be made into an
efficiently running program. This involves the
automation of the ad-hoc knowledge of computer science.
A complete Automatic Programming system thus
consists of four major phases: Problem Acquisition,
Process Transformation, Model Verification, and
Automatic Coding. Problem Acquisition is the process
by which the system obtains (1) a description of the
problem to be solved or task to be performed in a form
processable by the system, and (2) the knowledge needed
to solve the problem. The result of this phase is a
well formed problem and knowledge base which can be
manipulated by the system and transformed into a high
A Global View of Automatic Programming Page 16
Robert M. Balzer
level process for solving the problem during the second
phase. The third phase is used to verify that this
process is the one desired and that it is adequate for
the problem solution. The fourth phase, Automatic
Coding, fills in the necessary details, optimizes the
process, and produces the actual code to solve the
problem.
A Global View of Automatic Programming Page 17
Robert M. Balzer
THE AUTOMATIC PROGRAMMING MODEL
One of the most striking and deep rooted features
of the Automatic Programming model presented here is
the interface it creates between a high level external
specification of a problem which omits data structures
which are not part of the domain and the internal
implementation of that specification is an efficient
representation.
This choice of a basic interface has predicated
large parts of the entire model. The reason for this
choice as the basic interface within the Automatic
Programming model is that through it we expect to gain
in four important ways.
First, the complete model conjecture states that
such a division is feasible for stating and solving
domain dependent problems.
Second, since the choice of a data representation
and the maintenance of its consistency occupy such a
large portion of current programs we expect that the
size and complexity of specifications without such
representations will be drastically reduced over those
which retain such representations.
Third, since so much detail has been removed from
the specification it is easier for the system to
A Global View of Automatic Programming Page 18
Robert M. Balzer
understand what the task is rather than getting lost in
the details of what is going on.
Finally, since the problem has not been
overspecified with a particular choice of
representation so that the problem was expressible, the
system is now free to choose a representation that will
efficiently solve the problem at hand. The system has
been given increased flexibility in its choice and may
well outperform humans in correctly making
representation choices. Not because the system is more
intelligent than the user, but because it can cycle
through more possibilities and bring to bear a greater
level of effort in such optimizations than any user is
willing or able to invest in such issues.
A Global View of Automatic Programming Page 19
PROBLEM ACQUISITION
PROBLEM ACQUISITION
The Problem Acquisition phase is concerned with
obtaining an understanding of the users problem and the
domain in which it exists so that the Process
Transformation phase can attempt to find a sequence of
transformations or operations in that domain which will
obtain the solution required by the user. Thus, the
Problem Acquisition phase is concerned with building
the complete model of the users domain which represents
the interactions between the entities of that domain
and the effect on those entities by the allowed
transformations or operations applicable within the
domain. It is our primary contention that only through
the development of such a complete model of the user's
domain can the Automatic Programming system have any
degree of generality in the domains for which it is
applicable.
Currently, all such models of user domains have
been coded into a system. We are proposing that such
models can be specified to the system by its users and
that through these models the system can acquire the
knowledge necessary to solve problems within these
domains and to understand what is required for such a
solution. The two main issues, then, are what
constitutes an adequate and appropriate model and how
A Global View of Automatic Programming Page 20
PROBLEM ACQUISITION
is such a model specified or communicated to the
system.
There are basically two types of models,analogical
and fregean. Analogical models bear a strong
resemblence to the structure of the object being
described, such as the floor plan for a room, or the
diagram model used by Gerlernter in his geometry
proving program.
Fregean systems, on the other hand, are linguistic
or relation based, in which expressions are built up on
the relation between functions and arguments to those
functions.
Since one of the basic goals of the Automatic
Programming system is the generality of problem domains
that it is willing and capable of handling we have
chosen to take the fregean model approach.
We can now define the adequacy and appropriateness
of the models for the Automatic Programming system. A
fregean model is adequate if it contains a complete
enough description of the relations between the
entities in the problem domain that a sequence of
operations or transformations on this model can be
built to solve the problem posed by the user. This is
what we referred to as the complete model in the
A Global View of Automatic Programming Page 21
PROBLEM ACQUISITION
Desiderata Chapter. Thus, the adequacy of a model is
dependent upon the use of that model to solve the
problem. Operationally, this requires that the
Automatic Programming system is capable of finding the
complete set of applicable transformations on the model
and can calculate the consequences of each of these
actions. The appropriateness of the model is a measure
of how well suited the available transformations are to
solving the problem at hand, i.e., an adequate model
can be made more appropriate by adding to it
non-primitive transformations made up of a sequence of
primitive ones, which are suitable building block for
the problem being posed. The model may also be made
more appropriate by including recommendations about the
suitability of alternative strategies for sequences of
model transformations.
We expect that the well-known problem of building
a powerful general purpose problem solver will be
significantly aided by the user's tailoring of the
specified model to make it more appropriate for the
problem at hand.
The state of the art in natural language
understanding is adequate for the description of
problems and of models and is beyond our ability to
utilize the information thus obtained, and hence,
A Global View of Automatic Programming Page 22
PROBLEM ACQUISITION
should not be a bottleneck in an Automatic Programming
system. Evidence for this viewpoint comes from the
work of Woods in the Moonrocks Program, from the work
of Winograd in the Blocks Description Program, and the
work of Martin Kay in the Mind System. Each of these
systems represents an alternative linguistic technology
and each is capable of handling a wide range of
linguistic forms within the domain of its competence.
The basic viewpoint then is to process the users
natural language communication with the understanding
that it is meant to convey to the Automatic Programming
system a model of his problem domain. Towards this end
the system can extract entities and the relationships
between them from the communication. It can further
query the user as to the relationships between entities
which have not, as yet, been explicitly specified but
which have been inferred by the previous communication.
Such inferences by the system about the incompleteness
of the model require a sophisticated understanding not
only of the communication but of the types of models
used for problem domain specification. Unfortunately,
our sophistication in both these areas is quite
limited. In communication we need to be able to
understand how information is ordered for presentation,
how context is established and utilized, how the
capabilities of the recipient effects the communication
A Global View of Automatic Programming Page 23
PROBLEM ACQUISITION
and how these capabilities are perceived by the
speaker. In modelling we need to have a space of
possible models, an understanding of how the parts of a
model interact, a means for recognizing incompleteness
in models, as well as inconsistencies, a means for
obtaining all the allowed operations on the model, and
the means for transforming the models with these
operations.
PROCESS TRANSFORMATION
Our contention is that the main activity in
programming is not finding a solution but in finding a
solution which drops out the irrelevancies and which
abstracts the necessary processing so that it can be
efficiently implemented. We recognize that this is a
strong contention but we feel that in most programming
problems a solution is known and the main concern is in
finding a more efficient one. We are not talking here
about optimization in the normal sense. We are
concerned here with finding irrelevancies in the
complete model and representational abstractions based
on the required processing of that model. Once these
logical representations have been found they must be
efficiently implemented.
The above contention, if true, greatly shifts the
emphasis within the Process Transformation Phase from
A Global View of Automatic Programming Page 24
PROBLEM ACQUISITION
that of a general problem solver solving problems in a
domain independent way to modifying a solution so that
it does not maintain any irrelevant portions of the
complete model and which abstracts the relevant
portions into a more efficient representation for the
processing required. Together with Problem Acquisition
the ability to find representational abstractions and
transform complete model solutions into ones which
utilize these representations represent the main
technological deficiencies with obtaining an Automatic
Programming system.
MODEL VERIFICATION
Although we can be assured that the Automatic
Coding phase will produce only correct code, Program
Testing cannot disappear. This is because the Problem
Acquisition Phase and the Process Transformation Phase
will undoubtedly employ a number of heuristics and may
very well incorrectly interpret either the problem
statement or the allowed transformations that can occur
in the user's model. Because of this, the user must
verify that the system created is the one that he
desired.
The technology for this is at hand. It consists
of today's methods wherein a test case is given to the
system and its performance is used to validate the
A Global View of Automatic Programming Page 25
PROBLEM ACQUISITION
model that it constructed. Additionally, the system
can aid the process by generating test cases of its own
which probe areas of which it is uncertain and which
could have led to either misunderstanding or
incompleteness in the original model. One might also
expect that program debugging would disappear, but for
very similar reasons it too will remain under Automatic
Programming. If there is a disparity between the users
model and the system's model, then the reason for this
disparity must be obtained.
AUTOMATIC CODING
Automatic Coding is concerned with finding an
efficient computer implementation of the process
description obtained from the proceeding phase. This
description does not yet include a choice of data
representations, but does specify the major processing
elements and sequences. It is intended that this phase
will not need any domain specific knowledge except for
input frequency and distribution information. The
major logical representation and processing decisions
have already been made by the Process Transformation
phase.
Of all the phases in the Automatic Programming
System, the Automatic Coding one is the one essential
component of any Automatic Programming System. Without
A Global View of Automatic Programming Page 26
PROBLEM ACQUISITION
it the system cannot produce programs, and hence,
though it may be useful it is not an Automatic
Programming System.
Most people are not truly creative when they
reorganize sections of their program to increase
efficiency. Rather than inventing totally new
representations they appear to select one out of an
ill-defined set of such possible representations and to
adapt and modify it to function in the current
situation. This is probably the main challenge to the
Automatic Coding phase, the ability not only to cycle
through a set of alternative representations but to
adapt and modify them to the existing situation. Such
an ability would vastly increase the applicableness of
a small set of alternative representations.
From such Automatic Coding studies, one would
expect to see both a set of heuristics and a calculus,
eventually, for data representation choices.
A Global View of Automatic Programming Page 27
PROBLEM ACQUISITION
REFERENCES
1. AUTOMATIC PROGRAMMING, R. M. BALZER Institute
Technical Memorandum ISI/RR-1 USC INFORMATION SCIENCES
INSTITUTE September 1972